Abstract

We study the three-dimensional persistent random walk with drift. Then we de- 
velop a thermodynamic model that is based on this random walk without assuming 
the Boltzmann-Gibbs form for the equilibrium distribution. The simplicity of the 
model allows us to perform all calculations in closed form. We show that, despite 
its simplicity, the model can be used to describe different polymer stretching ex- 
periments. We study the reversible overstretching transition of DNA and the static 
force-extension relation of the protein titin. 



1 Introduction 

Nowadays, one can experimentally manipulate individual molecules. One can 
measure force-extension relations of biopolymers under varying circumstances. 
Double stranded DNA is intensively studied in the literature. Mostly the Worm 
Like Chain model is used to describe the properties of DNA in a good sol- 
vent. The theoretical and experimental force-extension relations are in good 
agreement at relatively low forces [1]. When one applies a large force (around
65 pN) on double stranded DNA, one observes an abrupt increase of the contour
length of the molecule [2,3]. This marks a transition from the standard
B-form of DNA to a so called S-form of DNA. In [3], one performs stretching
and releasing experiments on the same molecule under different solvent conditions.
Often the stretch and release curves do not coincide and hysteresis is
observed. One can eliminate this hysteresis by increasing the salt concentration
of the solvent. Then one can assume that the experiment is performed
under equilibrium conditions. The salt dependence of the overstretching transition
is further studied in [4]. The temperature dependence of this transition
is studied in [5].

The reversible B-DNA to S-DNA transition is studied theoretically by different 
authors. Typically the developed models contain several adjustable parame- 
ters. We will make a strict distinction between 'a posteriori' (called posteriors) 
and 'a priori' parameters (called priors). The former have the usual meaning 
of parameters that are changed in order to obtain a good fit of a theoretical 
curve to an experimental dataset. The latter have the meaning that the values 
of the parameters can be considered as fixed and are not used in the fitting 
procedure. In an attempt to explain the obtained experimental data of the 
B-DNA to S-DNA transition, a pure two-state model is used in [2]. The application
of this model is limited to the transition region. For this reason, the
model is combined in [6] with the well known Worm Like Chain model. The
combination of the two models results in a good fit to the experimental data
of [2] with 2 posteriors and 3 priors. Another theoretical model is introduced
in [7]. This so called Discrete Persistent Chain model combines features from
the Worm Like Chain model and the Freely Jointed Chain model and contains 
7 posteriors (although some of them could have been used as priors). As a con- 
sequence of the large number of parameters, it is no surprise that a very good 
fit to one of the experimental datasets of [3] is obtained. In [8], the authors 
argue that one has to include salt effects to explain all the datasets of [4],
obtained under different solvent conditions. Starting from a phenomenological
expression for the free energy, the authors obtain theoretical curves that
depend on 4 posteriors and several priors. A good fit to 7 different datasets
of [4] is obtained. The fit parameter that is the most sensitive to the salt
concentration is the effective length of charge separation. The range of values 
of this parameter obtained by the fit to the 7 datasets is in agreement with 
previous studies. The authors conclude that their ansatz for the free energy is 
physically meaningful. 

Another molecule that is often studied in the literature is the protein titin. 
It contains approximately 30,000 amino acids. The most important part of
this protein is the so called PEVK region that is flanked by immunoglobulin 
domains. The PEVK region is a chain of amino acids that behaves like a 
random coil. The immunoglobulin domains are folded parts of the protein. 
In [9], the static (equilibrium) force-extension relation is measured. Although
the resulting curve is non-trivial, surprisingly one usually only studies in the 
literature [10] the dynamic force-extension relation of this protein. In this 
paper we will develop a thermodynamic model that is based on the three- 
dimensional random walk, to describe the static force-extension relation of 
titin. 






In [11], the present authors obtained analytical results for the one-dimensional
persistent random walk with drift. It is shown in [12,13] that this walk can be
used as a qualitative model for a polymer in solution. In the present paper we 
study the three-dimensional persistent random walk with drift. We will show 
that it can be used to study the force-extension relations of overstretched DNA 
and of the protein titin. The persistent random walk is intensively studied in 
the literature [14,15,16,17]. However, to the best of our knowledge, this is the
first time that the persistent random walk is used to study the aforementioned 
force-extension relations. 

In the next section our mathematical model is introduced. The section starts 
with a brief summary of the most important results of [18]. The application of
these results to the three-dimensional persistent random walk with drift then
follows. In section 3 we establish the connection between the parameters of our
mathematical model and the thermodynamic control parameters of interest.
In sections 4 and 5 the applications of our formalism are studied. The last
section gives a short discussion of the results. 



2 Mathematical model 

In [18] Markov chains with a finite number of states are studied. A Markov
chain with state space $\Gamma$ is determined by initial probabilities $p(x)$ ($x \in \Gamma$),
and by transition probabilities $w(x,y)$ ($x,y \in \Gamma$), with

$$1 = \sum_{x\in\Gamma} p(x) \quad\text{and}\quad 1 = \sum_{y\in\Gamma} w(x,y). \qquad (1)$$

The probabilities $p(x)$ are used as initial values for the equation of motion

$$p_{t+1}(x) = \sum_{y\in\Gamma} p_t(y)\,w(y,x). \qquad (2)$$

They are called stationary if the following equation holds

$$p(x) = \sum_{y\in\Gamma} p(y)\,w(y,x). \qquad (3)$$

The record of transitions $k$ is defined [18] as a sequence of numbers $k_{x,y}$, one
for each pair of states $x, y$, counting how many times the transition from $x$ to
$y$ is contained in a given path of the Markov chain. One can prove the following
properties for stationary Markov chains [18]

$$\langle k_{x,y}\rangle = n\,p(x)\,w(x,y),$$

$$S = -n\sum_{x,y\in\Gamma} p(x)\,w(x,y)\ln w(x,y) - \sum_{x\in\Gamma} p(x)\ln p(x), \qquad (4)$$

with $n$ the total number of chain elements, $S$ the standard Boltzmann-Gibbs
entropy and $\langle\cdot\rangle$ an average over phase space.

Consider now a discrete, three-dimensional random walk with transition prob- 
abilities which depend only on the direction of the present and of the previous 
step. This means that the process of the increments is Markovian. The state 
space of the latter process contains 6 elements and is 



$$\Gamma = \{x+,\, x-,\, y+,\, y-,\, z+,\, z-\}. \qquad (5)$$

To clarify this notation, $x+$ has the meaning of a step in the positive $x$-direction,
while $x-$ is a step in the negative $x$-direction. The Markov chain
is determined by $6 \times 6 = 36$ different transition probabilities. To reduce this
number we first assume that the walker cannot turn back. This is equivalent
to the following constraints on the transition probabilities

$$0 = w(x+,x-) = w(y+,y-) = w(z+,z-) = w(x-,x+) = w(y-,y+) = w(z-,z+). \qquad (6)$$

For the applications, studied in the present paper, it is important to break 
the symmetry in one of the spatial directions and to distinguish between steps 
that go straight on or change direction. So, to further reduce the number of 
transition probabilities we can assume that the states y+, y—, z+ and z— are 
equivalent. Then define the following shorthand notation 



$$\epsilon = w(x+,y+) = w(x+,y-) = w(x+,z+) = w(x+,z-),$$

$$\mu = w(x-,y+) = w(x-,y-) = w(x-,z+) = w(x-,z-),$$

$$\alpha = w(y+,x+) = w(y-,x+) = w(z+,x+) = w(z-,x+),$$

$$\gamma = w(y+,x-) = w(y-,x-) = w(z+,x-) = w(z-,x-),$$

$$\theta = w(y+,z+) = w(y-,z+) = w(y+,z-) = w(y-,z-) = w(z+,y+) = w(z-,y+) = w(z+,y-) = w(z-,y-). \qquad (7)$$

Using the normalisation condition (1) one obtains the following expressions for
the remaining 6 transition probabilities

$$w(y+,y+) = w(z+,z+) = w(y-,y-) = w(z-,z-) = 1 - \alpha - \gamma - 2\theta,$$

$$w(x+,x+) = 1 - 4\epsilon, \qquad w(x-,x-) = 1 - 4\mu. \qquad (8)$$

Fig. 1. Plot of the average endposition (A), the average number of changes of direction
(B) and the entropy (C) as a function of the force. For all figures, the value of
the temperature equals T = 0.05 for the solid line and T = 0.5 for the dotted line.
(h = -1, a = b = 1)
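The constraints (6), the shorthand (7) and the diagonal elements (8) can be collected in a single transition table. The following is a minimal sketch in plain Python, not the authors' implementation; the function name `transition_matrix` and the dictionary layout are illustrative choices, and the assignment of $\alpha$ to transitions into $x+$ and of $\gamma$ to transitions into $x-$ is an assumption chosen to be consistent with the normalisation factor $N$ quoted in the text.

```python
# States of the increment process, in the order of equation (5).
STATES = ["x+", "x-", "y+", "y-", "z+", "z-"]

def opposite(state):
    """Return the state pointing in the opposite direction, e.g. 'x+' -> 'x-'."""
    return state[0] + ("-" if state[1] == "+" else "+")

def transition_matrix(eps, mu, alpha, gamma, theta):
    """Build w[s][t], the probability that step t follows step s."""
    w = {s: {} for s in STATES}
    for s in STATES:
        for t in STATES:
            if t == opposite(s):
                p = 0.0                                # equation (6): no turning back
            elif s == t:                               # equation (8): going straight on
                if s == "x+":
                    p = 1 - 4 * eps
                elif s == "x-":
                    p = 1 - 4 * mu
                else:
                    p = 1 - alpha - gamma - 2 * theta
            elif s == "x+":
                p = eps                                # x+ to a transverse direction
            elif s == "x-":
                p = mu                                 # x- to a transverse direction
            elif t == "x+":
                p = alpha                              # transverse to x+ (assumed labelling)
            elif t == "x-":
                p = gamma                              # transverse to x- (assumed labelling)
            else:
                p = theta                              # between y and z directions
            w[s][t] = p
    return w

w = transition_matrix(0.1, 0.05, 0.2, 0.1, 0.15)
for s in STATES:
    print(s, sum(w[s].values()))                       # every row should sum to 1
```

Every row sums to one by construction, which is exactly how (8) follows from the normalisation condition (1).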

So now we are left with a 5-parameter model. The stationary probabilities can
be calculated by solving the set of equations (3) together with the normalisation
condition (1). The result is

$$p(x+) = \frac{\alpha\mu}{N}, \qquad p(x-) = \frac{\gamma\epsilon}{N}, \qquad p(y\pm) = p(z\pm) = \frac{\epsilon\mu}{N}, \qquad (9)$$

with $N = 4\epsilon\mu + \alpha\mu + \gamma\epsilon$. Then the average number of changes of direction
$\langle K\rangle$ and the entropy $S$ can be calculated with (4)

$$\frac{\langle K\rangle}{n} = 1 - \frac{1}{n}\sum_{x\in\Gamma}\langle k_{x,x}\rangle = (\alpha + \gamma + \theta)\,\frac{8\epsilon\mu}{N},$$

$$\frac{S}{n} = -p(x+)\left[(1-4\epsilon)\ln(1-4\epsilon) + 4\epsilon\ln\epsilon\right] - p(x-)\left[(1-4\mu)\ln(1-4\mu) + 4\mu\ln\mu\right]$$

$$\qquad - 4p(y+)\left[(1-\alpha-\gamma-2\theta)\ln(1-\alpha-\gamma-2\theta) + \alpha\ln\alpha + \gamma\ln\gamma + 2\theta\ln\theta\right], \qquad (10)$$

where finite size corrections are ignored in the expression for the entropy. We
now introduce an extra asymmetry in the model. We use two different lattice
parameters $a$ and $b$. If the walker goes straight on, the length of the step is $a$;
if the walker changes direction, the length of the step is $b$. The $x$-component
of the endposition of the walk is

$$x = a\,(k_{x+,x+} - k_{x-,x-}) + b\sum_{l\in\Gamma'}(k_{l,x+} - k_{l,x-}), \qquad (11)$$

with $\Gamma' = \Gamma \setminus \{x+, x-\}$. The average of $x$ becomes

$$\frac{\langle x\rangle}{n} = \frac{a}{N}\left[\alpha\mu(1-4\epsilon) - \gamma\epsilon(1-4\mu)\right] + \frac{b}{N}\,4\epsilon\mu(\alpha - \gamma). \qquad (12)$$

The averages of the $y$- and $z$-components of the endposition of the walk vanish
because of the imposed symmetry.
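The closed-form results of this section are easy to check numerically. The sketch below is an illustration only: the function names `stationary` and `averages` are hypothetical, and the expressions used for $p(x+)$ and $p(x-)$ are the ones obtained by solving (3) together with (1) under the labelling assumption that $\alpha$ and $\gamma$ denote transitions into $x+$ and $x-$ respectively.

```python
def stationary(eps, mu, alpha, gamma):
    """Closed-form stationary probabilities (p(x+), p(x-), p(y+-) = p(z+-))."""
    norm = 4 * eps * mu + alpha * mu + gamma * eps     # the factor N of the text
    return alpha * mu / norm, gamma * eps / norm, eps * mu / norm

def averages(eps, mu, alpha, gamma, theta, a, b):
    """Return (<K>/n, <x>/n) evaluated with the stationary probabilities."""
    p_plus, p_minus, p_t = stationary(eps, mu, alpha, gamma)
    stay_t = 1 - alpha - gamma - 2 * theta
    # Fraction of steps that go straight on, summed over the six states.
    straight = p_plus * (1 - 4 * eps) + p_minus * (1 - 4 * mu) + 4 * p_t * stay_t
    turns = 1 - straight
    # Straight steps along x have length a; steps turning into x+ or x- have length b.
    x = (a * (p_plus * (1 - 4 * eps) - p_minus * (1 - 4 * mu))
         + 4 * b * p_t * (alpha - gamma))
    return turns, x

eps, mu, alpha, gamma, theta = 0.1, 0.05, 0.2, 0.1, 0.15
p_plus, p_minus, p_t = stationary(eps, mu, alpha, gamma)
print(p_plus + p_minus + 4 * p_t)                      # normalisation condition (1)
print(averages(eps, mu, alpha, gamma, theta, a=1.0, b=1.0))
```

For a symmetric choice ($\epsilon = \mu$, $\alpha = \gamma$) the second returned value vanishes, in line with the absence of drift in that case.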



3 Thermodynamic model 



In this section, we follow the lines of [13] to obtain expressions for the thermodynamic
control parameters of interest as a function of $\epsilon$, $\mu$, $\alpha$, $\gamma$, $\theta$. In
experiments one measures force-extension relations of biopolymers at constant 
temperature. To establish the connection between the random walk and the 
polymer, we interpret the endposition of the walk as the extension of the 
molecule. The corresponding control parameter is the force F applied to the 
endpoint of the molecule. The Hamiltonian is defined as H = hK, with K the 
number of changes of direction and h a constant with dimensions of energy. 
Depending on the sign of h, the ground state of the system is a compact walk 
that always changes direction (h < 0) or a stretched walk that always goes 
straight on (h > 0). The reason why we define the Hamiltonian like this will 
become clear in the following sections. The control parameter corresponding 
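with the energy $E = h\langle K\rangle$ is the temperature $T$. The Legendre transform of
the entropy $S$ is the free energy $G$

$$G = \min_{\epsilon,\mu,\alpha,\gamma,\theta}\left\{E - F\langle x\rangle - \frac{1}{\beta}S\right\}, \qquad (13)$$

with $\beta$ the inverse temperature. Explicit expressions for $E = h\langle K\rangle$, $\langle x\rangle$ and $S$ as a function of the model
parameters are known, see (10) and (12). As a consequence, the solution of
the set of equations

$$\frac{\partial}{\partial\xi}\left(E - F\langle x\rangle - \frac{1}{\beta}S\right) = 0, \qquad \xi \in \{\epsilon,\mu,\alpha,\gamma,\theta\}, \qquad (14)$$

can be obtained in closed form

$$\beta\left[-h + F(b-a)\right] = \ln\frac{\epsilon\alpha}{(1-4\epsilon)\theta}, \qquad h\beta = \ln\frac{1-\alpha-\gamma-2\theta}{\theta},$$

$$2\beta F b = \ln\frac{\alpha\epsilon}{\gamma\mu}, \qquad 2\beta F a = \ln\frac{1-4\epsilon}{1-4\mu},$$

$$(1-\alpha-\gamma-2\theta)^2 = (1-4\epsilon)(1-4\mu). \qquad (15)$$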



This set of equations can be inverted and has a unique physical solution for
every value of $\beta$ and $F$, see appendix A. Then a plot of the average endposition
of the walk as a function of the external force at constant temperature can
be obtained, see figure 1A. When $h < 0$, the average endposition increases in
two steps at low temperatures. At high temperatures this multi-step behaviour
disappears. It is useful to study the average number of changes of direction and
the entropy as a function of the force to understand this non-trivial behaviour,
see figures 1B and 1C.
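For readers who want numbers without the closed-form inversion of appendix A, the set (15) can also be solved numerically. The sketch below is not the method of the paper. It rests on an assumption that is not stated in the text: that the minimising transition probabilities have the form $w(s,t) \propto e^{-\beta e(s,t)}\,\psi(t)/\psi(s)$, where $e(s,t)$ is the energy $h$ per change of direction minus $F$ times the displacement along $x$, and $\psi$ is the dominant eigenvector of the matrix of weights. The function names are hypothetical, and the form of (15) used in `residuals_15` is the one reconstructed from the surrounding definitions.

```python
import math

def solve_parameters(beta, force, h, a, b, iterations=5000):
    """Numerically obtain (eps, mu, alpha, gamma, theta) for given beta and F."""
    u = math.exp(-beta * h)              # weight of one change of direction
    ea = math.exp(beta * force * a)      # straight step of length a along +x
    eb = math.exp(beta * force * b)      # step of length b turning into +x
    # Weights between the classes (x+, x-, transverse). A transverse state can
    # go straight on (weight 1) or switch to two perpendicular transverse states.
    rows = [
        [ea, 0.0, 4.0 * u],
        [0.0, 1.0 / ea, 4.0 * u],
        [u * eb, u / eb, 1.0 + 2.0 * u],
    ]
    psi, lam = [1.0, 1.0, 1.0], 1.0
    for _ in range(iterations):          # power iteration for the largest eigenvalue
        new = [sum(r * v for r, v in zip(row, psi)) for row in rows]
        lam = max(new)
        psi = [v / lam for v in new]
    psi_p, psi_m, psi_t = psi
    eps = u * psi_t / (lam * psi_p)
    mu = u * psi_t / (lam * psi_m)
    alpha = u * eb * psi_p / (lam * psi_t)
    gamma = (u / eb) * psi_m / (lam * psi_t)
    theta = u / lam
    return eps, mu, alpha, gamma, theta

def residuals_15(params, beta, force, h, a, b):
    """Left-hand side minus right-hand side for the five relations of (15)."""
    eps, mu, alpha, gamma, theta = params
    stay_t = 1 - alpha - gamma - 2 * theta
    return [
        beta * (-h + force * (b - a)) - math.log(eps * alpha / ((1 - 4 * eps) * theta)),
        beta * h - math.log(stay_t / theta),
        2 * beta * force * b - math.log(alpha * eps / (gamma * mu)),
        2 * beta * force * a - math.log((1 - 4 * eps) / (1 - 4 * mu)),
        stay_t ** 2 - (1 - 4 * eps) * (1 - 4 * mu),
    ]

params = solve_parameters(beta=2.0, force=0.5, h=-1.0, a=1.0, b=1.0)
print(params)
print(residuals_15(params, 2.0, 0.5, -1.0, 1.0, 1.0))   # should all be close to zero
```

Power iteration converges slowly close to the compact-to-stretched transition at low temperature, where two eigenvalues nearly coincide, so the iteration count may need to be increased there.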

The steep increase of the average endposition at low forces is a result of the 
degeneracy of the ground state of the model. Contrary to the one-dimensional
case, the three-dimensional random walk has different compact configurations 
for which the number of changes of direction equals the total number of steps. 
At vanishing force, the system has no preference for any of these configura- 
tions which results in a vanishing average endposition and a high value of the 
entropy. At low force, the walk can lower its free energy by choosing the con- 
figuration with the largest extension without changing the number of changes 
of direction. As a consequence one observes an increase of $\langle x\rangle$, a decrease of
$S$, while $\langle K\rangle$ remains approximately constant.

The steep increase of the average endposition at intermediate force is also 
present in the one-dimensional random walk [13]. It is the result of the competition
between the folding energy $E$ and the potential energy $-F\langle x\rangle$. In [13],
this sudden change of the average endposition is used to define the gradual
transition from compact to stretched state of the walk. The boundary line is
obtained by calculating the peak value of $d\langle x\rangle/dF$ at constant temperature.
For the present model, the same criterion is used. Figure 2 shows the entropy
as a function of force and temperature. The black solid line shows the bound- 
ary between the two phases and is obtained by a numerical calculation. An 
unexpected result is that the boundary line is an increasing function of the 
temperature at low temperatures. This is also observed in the one-dimensional

Fig. 2. Plot of the entropy as a function of the temperature and the force. The
colour code is mentioned to the right. The black solid line marks the gradual transition
from the compact phase to the stretched phase. The black dotted line is an
approximation for the solid line, valid at low temperatures only. (h = -1, a = b = 1)

persistent random walk [13] and in self avoiding random walks [19,20] and is
a consequence of a subtle asymmetry in the entropy that favours the compact
phase above the stretched phase. This asymmetry can be seen in figures
1C and 2. At low temperatures, the entropy is clearly not symmetric around
F = 2.

In appendix B an approximate expression for the boundary line at low
temperatures is calculated, under the assumptions $h < 0$, $a < b < 2a$. One
obtains $F = 2 + 2T + \ldots$, with $h = -1$ and $a = b = 1$. Figure 2 shows the result
of the latter expression together with the exact boundary line. The two lines
coincide up to a temperature of approximately 0.5. An approximate formula
for the force-extension relation in the transition region ($b/2 < \langle x\rangle/n < a$)
is also derived in appendix B. The result of the latter expression and the
exact force-extension relation are shown in figure 3 for different values of the
parameter $h\beta$. Clearly the approximation becomes better for higher values of
$-h\beta$.



4 B-DNA to S-DNA transition 



When one applies a large force on double stranded DNA, one observes a tran- 
sition from the standard B-form of DNA (with length per base pair of 0.34 
nm) to a so called S-form of DNA (with length per base pair of 0.58 nm). 
Introduce $L_B$ and $L_S$, the contour length of B-DNA and S-DNA respectively.
Then, divide the DNA chain in short segments that are in the B-form or S-form.
Then define $N_s$ and $N_{bp}$ as the number of segments and the total number
of base pairs of the DNA chain respectively.

Now we use the three-dimensional walk as a model for the DNA chain. In

Fig. 3. The exact force-extension relation (solid lines) together with the approximate
relation (dotted lines) obtained for different values of the parameter $-h\beta$: (top)
1; 5; 24.67 (bottom). For all curves the following values of the remaining parameters are
used: a = 1.72, b = 2, h = -19.04b.

our interpretation, changing direction during the walk corresponds with a B-segment
of the DNA chain, while going straight on during the walk corresponds
with an S-segment. Now it becomes clear why we defined the Hamiltonian as
$H = hK$ with $K$ the number of changes of direction of the walk and $h$ a
constant with dimensions of energy. When we choose the sign of $h$ negative,
the ground state of the model is the desired standard B-form of DNA. We
introduced two different lattice parameters $a$ and $b$. If the walker changes
direction, the length of the step is $b$; if the walker goes straight on, the length
of the step is $a$. As a consequence, the total number of steps of the walk $n$ and
the number of segments of the DNA chain $N_s$ are equal. The contour lengths
of the B-DNA and the S-DNA are connected to the lattice parameters of the
random walk by

$$L_B = N_s\,\frac{b}{2}, \qquad L_S = N_s\,a. \qquad (16)$$



In [2], the force-extension curve of a single DNA molecule is measured. An
overstretching transition at $F \approx 65$ pN is observed. A theoretical model to describe
this transition is introduced in [6]. This model contains 2 posteriors and
3 priors. The present model contains 3 parameters $a$, $b$ and $h$. The ratio $a/b$ is
considered to be a prior (like in [5]), while $b$ and $h$ are posteriors. In figure 4
the experimental data from [2] are shown together with the theoretical curves
of [6] and the present model. The two models give an equally good description
of the experimental data. The force-extension relation of the present model
is obtained with a constant temperature $T = 300$ K and with the following values
of the model parameters: $b = 2.9$ nm, $a = 0.81b$, $h = -20.5b$ pN. The typical
length of one base pair in B-DNA is 0.34 nm. As a consequence the contour
length of B-DNA is equal to $L_B = 0.34\,N_{bp}$ nm. Together with (16), one obtains
$N_{bp}/N_s \approx 4$. So the number of base pairs in one segment is of the order
of 4. In [6] the values 1 and 10 are used for the number of base pairs in one

Fig. 4. Force-extension curves of a single double-stranded DNA molecule. (• • •):
Experimental data from [2]. (solid, dotted line): Theoretical curves obtained with
the present model and the model of [6] respectively.

segment. This means that the two theoretical models ignore interactions that 
make the DNA molecule stiffer. This is not unexpected, because both models 
use very simple Hamiltonians. The good representation of the experimental 
data shows that the theoretical models are able to catch the essence of the 
overstretching transition when effective parameters are introduced. These ef- 
fective parameters, like the number of base pairs in one segment, are then used 
to artificially increase the stiffness of the molecule. 
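The estimate of about 4 base pairs per segment can be retraced in one line. The sketch below assumes that relation (16) identifies the contour length of B-DNA with the maximal extension of the compact walk, $L_B = N_s b/2$; this assumption is consistent with the lower bound $b/2$ of the transition region quoted in section 3.

```latex
\[
  N_s\,\frac{b}{2} \;=\; L_B \;=\; 0.34\,\mathrm{nm}\times N_{bp}
  \quad\Longrightarrow\quad
  \frac{N_{bp}}{N_s} \;=\; \frac{b}{2\times 0.34\,\mathrm{nm}}
  \;=\; \frac{2.9}{0.68} \;\approx\; 4.3 .
\]
```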

The two theoretical models contain several adjustable parameters. So a good 
representation of one dataset is not really surprising. Therefore we test our 
model further by studying the salt and temperature dependence of the overstretching
transition. In figures 5 and 6 the results of our theoretical model
are shown together with the experimental data of [4] and [5] respectively. In
[4] the salt dependence of the transition is studied while in [5] the temperature
dependence is studied. The value of the prior $a/b$ is equal to 0.86 for all the
theoretical curves. The values of the posteriors are obtained by a least squares
analysis. For every curve $\chi^2$ is defined by

$$\chi^2 = \sum_{i=1}^{m}\left(\langle x\rangle_{i,\mathrm{theory}} - \langle x\rangle_{i,\mathrm{experiment}}\right)^2, \qquad (17)$$

with $m$ the number of datapoints of the curve. The values of the posteriors
$h/b$ and $h\beta$ are fixed by minimising this expression of $\chi^2$. The result can be
found in tables 1 and 2. The sensitivity of the least squares analysis is tested
by the following procedure. We keep the value of one of the fit parameters
fixed (at the value that minimises $\chi^2$) while varying the other. It turns out
that the value of $\chi^2$ is doubled after a variation of 10% in the parameter $h\beta$
and after a variation of 0.5% in the parameter $h/b$. This shows that the least
squares analysis is more sensitive to variations in $h/b$ than to variations in
$h\beta$. Important to mention is that we used the exact formulas of appendix A
for the least squares analysis, although the approximations of appendix B are

Fig. 5. This figure shows experimental [4] and theoretical force-extension relations
obtained for different salt concentrations. The values of the salt concentration (in
mM) of the experimental curves are (left) 10; 25; 50; 100; 250; 500; 1000 (right). The
values of the posteriors of the theoretical curves can be found in table 1.

Fig. 6. This figure shows experimental [5] and theoretical force-extension relations
obtained at different temperatures. The values of the temperatures (in K) of the experimental
curves are (left) 313; 308; 304; 294; 284 (right). The values of the posteriors
of the theoretical curves can be found in table 2.

very reliable for the values of the parameters used in figures 5 and 6. This
can be seen in figure 3 where we included a curve with the same values of the
adjustable parameters as in the far left curve of figure 5.
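The sensitivity test described above (hold one fit parameter at its optimum, vary the other until $\chi^2$ has doubled) can be sketched generically. The code below is an illustration and not the authors' fitting code: the function names are hypothetical, $\chi^2$ is assumed to be a plain sum of squared residuals (a constant normalisation would not change the doubling criterion), and the scan uses a simple fixed step.

```python
def chi_squared(model, params, data):
    """Sum of squared differences between model(x, *params) and the measured y."""
    return sum((model(x, *params) - y) ** 2 for x, y in data)

def relative_variation_to_double(chi2_of, best, step=1e-4, max_rel=1.0):
    """Smallest relative change of one parameter that doubles chi-squared.

    chi2_of is a function of that single parameter, with all other
    parameters held fixed at their best-fit values.
    """
    target = 2.0 * chi2_of(best)
    rel = step
    while rel <= max_rel:
        if chi2_of(best * (1 + rel)) >= target or chi2_of(best * (1 - rel)) >= target:
            return rel
        rel += step
    return None                                        # not doubled within max_rel

# Toy example: a chi-squared with minimum value 1 at p = 3.
toy = lambda p: 1.0 + (p - 3.0) ** 2
print(relative_variation_to_double(toy, best=3.0))     # about one third
```

A small returned value signals a parameter that the fit pins down tightly, which is the sense in which the analysis in the text is more sensitive to $h/b$ than to $h\beta$.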

The theoretical curves are only shown from $\langle x\rangle/L_B \approx 1$ to $\langle x\rangle/L_B \approx 1.7$,
because the salt dependence (and temperature dependence) at low forces is
not well described by the present model. The model contains three parameters
$a$, $b$ and $h$. Roughly these parameters determine the factor ($L_S/L_B$) by
which the contour length is increased, the value of the overstretching force
and the steepness of the transition. When the concentration of salt in the solvent
increases, DNA becomes more stable. As a consequence the value of the
overstretching force and the steepness of the transition increase, while $L_S/L_B$
remains the same. This can be captured by the present model and results in
a decrease of the value of the fit parameters $h\beta$ and $h/b$ (see table 1). A closer
look at the experimental data of [4] shows that the force-extension behaviour
at low forces depends less on the salt concentration than at intermediate and

C (mM)       10     25     50     100    250    500    1000
-hβ          24.67  27.54  44.75  46.21  49.44  51.81  52.95
-h/b (pN)    19.04  20.49  21.54  22.18  23.21  23.80  24.29

Table 1

The values of the posteriors of the theoretical curves of figure 5.

T (K)        313    308    304    294    284
-hβ          20.15  21.01  26.65  45.45  37.01
-h/b (pN)    20.22  21.33  22.16  23.44  24.05

Table 2

The values of the posteriors of the theoretical curves of figure 6.

high forces. This cannot be captured by the present model because the low
and high force behaviour cannot be changed independently. In [3], only the
influence of Na$^{+}$ on the overstretching transition is studied. The dependence
of the transition on multivalent cations like Mg$^{2+}$ is studied in [21]. A clear
change in the low force behaviour is observed when Mg$^{2+}$ is added to the
solution. Our model contains fewer adjustable parameters in comparison with
the other theoretical models studied in the literature. This leaves room for extending
our model with one extra parameter which can capture the low force
behaviour for different solvent conditions.

Figure 7 shows the value of the parameter $h/b$ as a function of the logarithm
of the salt concentration. The error bars in this figure show when the value
of $\chi^2$ is doubled by variation of $h/b$ while keeping $h\beta$ fixed. The value of the
parameter $h/b$ is approximately a linear function of the logarithm of the salt
concentration.
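The approximately linear dependence can be checked directly against the fitted values of table 1. The following sketch performs an ordinary least-squares fit of $-h/b$ against the logarithm of the salt concentration. The base-10 logarithm is an assumption (the text does not say which base is used in figure 7; the correlation coefficient does not depend on it), and the function name `linear_fit` is illustrative.

```python
import math

# Values copied from table 1.
salt_mM = [10, 25, 50, 100, 250, 500, 1000]
minus_h_over_b = [19.04, 20.49, 21.54, 22.18, 23.21, 23.80, 24.29]   # in pN

def linear_fit(xs, ys):
    """Ordinary least squares: return (slope, intercept, correlation coefficient)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

log_c = [math.log10(c) for c in salt_mM]               # assumed base-10 logarithm
slope, intercept, r = linear_fit(log_c, minus_h_over_b)
print(f"slope = {slope:.2f} pN per decade, intercept = {intercept:.2f} pN, r = {r:.3f}")
```

A correlation coefficient close to one supports the statement that $h/b$ varies roughly linearly with the logarithm of the concentration; the small systematic curvature of the tabulated values shows up as the deviation of $r$ from one.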

As already mentioned in the introduction, one can assume that force-extension
relations are measured under equilibrium conditions when no hysteresis
is observed. In [4] one obtains force-extension relations for overstretched DNA
at different salt concentrations. The authors mention that they always observe
hysteresis in their experiments at low salt concentrations (< 250 mM). In [5]
one obtains force-extension relations for overstretched DNA at different temperatures.
The authors mention that the stretch and release curves coincide
at room temperature, but that the difference between these curves is already
18 pN at 308 K. We conclude that one can assume that 3 curves of figure 5
(250 mM, 500 mM and 1000 mM) and 2 curves of figure 6 (294 K and 284 K) are
obtained under equilibrium conditions. The assumption of equilibrium is less
reliable for the other curves of these figures. It is also important to mention
that all curves shown in figures 5 and 6 are obtained during the stretch cycle
of the experiment [4,5].



5 Titin



Titin is a long protein that contains approximately 30,000 amino acids. In the
literature, the dynamic force-extension relation of this protein is extensively 
studied [10]. In [9], the static (equilibrium) force-extension relation of titin is 
also measured. 

The protein can be divided into two parts, the I-band and the A-band. At 
relatively low forces, the major contribution to the elasticity of titin comes 
from the I-band. Only at very large forces, the A-band becomes important. 
For that reason we consider only the I-band in this article. The I-band consists 
of a PEVK region (that contains 1,000 to 2,200 amino acids) that is flanked
by 70 to 90 immunoglobulin domains [9]. These immunoglobulin domains are
folded parts of the protein containing approximately 100 amino acids. Without 
applied force, the chain of amino acids of the PEVK region and the chain of 
immunoglobulin domains behave like random coils. When one applies a small 
force, the PEVK region immediately stretches out to almost its complete con- 
tour length. Meanwhile, the immunoglobulin domains line up (without unfold- 
ing) in the direction of the applied force. At high forces, the immunoglobulin 
domains will unfold one by one to further increase the contour length of the 
polymer. 

We consider the PEVK region and the immunoglobulin domains as two completely
independent parts of the protein. The average extension of the complete
protein $\langle x\rangle$ is then simply the sum of the average extension of the immunoglobulin
domains $\langle x_i\rangle$ and the average extension of the PEVK region $\langle x_p\rangle$

$$\langle x\rangle = \langle x_i\rangle + \langle x_p\rangle. \qquad (18)$$

The indices $i$ and $p$ are used for the variables of the immunoglobulin domains
and the PEVK region respectively. A similar formula holds for the entropy of the
complete system, $S = S_i + S_p$. The Hamiltonian of the complete system
is then $H = h_i K_i + h_p K_p$. The sign of $h_i$ is negative because a folded
immunoglobulin domain should be energetically favourable. The sign of $h_p$ is
positive because the amino acids of the PEVK region behave like a random
coil. The free energy of the complete system is

$$G = \min_{\epsilon_i,\mu_i,\alpha_i,\gamma_i,\theta_i,\epsilon_p,\mu_p,\alpha_p,\gamma_p,\theta_p}\left\{\langle H\rangle - F\langle x\rangle - \frac{1}{\beta}S\right\}$$

$$\quad = \min\left\{h_i\langle K_i\rangle - F\langle x_i\rangle - \frac{1}{\beta}S_i\right\} + \min\left\{h_p\langle K_p\rangle - F\langle x_p\rangle - \frac{1}{\beta}S_p\right\}. \qquad (19)$$

The results of these two minimisations can be calculated and are given by

Fig. 7. The value of the posterior $h/b$ as a function of the logarithm of the salt
concentration.

Fig. 8. This figure shows the experimental [9] (dotted lines) force-extension relation
of the protein titin together with the theoretical results (solid line) of the present
model. The values of the adjustable parameters are $a_p = b_p = 0.35$ nm, $a_i = 35$ nm,
$b_i/2 = 5$ nm, $n_p = 1995$, $n_i = 80$, $h_i/b_i = -180$ pN, $h_p/h_i = -3.2$ and $\beta h_i = -0.42$.

(15). Figure 8 shows the experimental data of [9] together with the theoretical
results of the present model. The model contains several adjustable parameters.
The following parameters can be considered as priors: $a_p = b_p = 0.35$ nm
(typical length of one amino acid), $a_i = 35$ nm (typical length of an unfolded
immunoglobulin domain containing 100 amino acids). The length of a folded
domain ($b_i/2$) is less known. Figure 8 is obtained with the choice $b_i/2 = 5$ nm.
This value is well within the range of experimental observations [9]. The values
of the remaining parameters are $n_p = 1995$, $n_i = 80$, $h_i/b_i = -180$ pN,
$h_p/h_i = -3.2$ and $\beta h_i = -0.42$. Note that the values of $n_p$ and $n_i$ are in the
range of experimental observations [9]. With so many adjustable parameters it
is no surprise that figure 8 shows a very good agreement between the theoretical
and experimental results at low and intermediate forces. To the best of our
knowledge, a comprehensive experimental study of the static force-extension 
relation of titin is not yet available in the literature. So an in depth comparison 
between theoretical and experimental results is not yet possible.



6 Discussion



To summarise, we study the three-dimensional persistent random walk with 
drift. We obtain analytical formulas for the average of the macroscopic vari- 
ables of interest as a function of the model parameters. Finally, the result- 
ing thermodynamic model is used to study polymer stretching experiments. 
We derive force-extension relations for two different polymers, overstretched 
double-stranded DNA and the protein titin. The molecules of double-stranded 
DNA and of the protein titin are non-oriented. Therefore, each walk and its 
mirror images must have the same probabilities. This is indeed the case in 
absence of an external force. However, in the presence of an applied force this 
is only the case when the distance between the endpoints in the direction of 
the force does not change under mirroring. We imposed this symmetry in the 
definition of the transition probabilities (7). The substitution $y+ \leftrightarrow y-$ has no
effect on the transition probabilities, contrary to the substitution $x+ \leftrightarrow x-$.

Note that we did not assume the Boltzmann-Gibbs form for the equilibrium 
probability distribution. Rather, following [13], we define the temperature by
calculating the Legendre transform of the entropy. The results of [13] and the
present paper show that it is possible to construct a thermodynamic model 
without the assumption of the Boltzmann-Gibbs distribution. Moreover, we 
believe that the results of the present paper prove that our approach is more 
than a mathematical exercise and can be used to study real physical systems. 
Figures 1 and 2 show that the present model contains three separate phases,
the random phase (high temperature), the stretched phase (high force) and the 
compact phase (low temperature and force). In principle a true phase transi- 
tion can occur in our finite model because we did not assume the Boltzmann- 
Gibbs form for the equilibrium distribution. To be sure that only gradual 
transitions show up in our model, we checked that the free energy is a convex 
function of the control parameters. The transition between the compact and 
the stretched phase is a gas-liquid-like transition. The boundary line between 
these two phases ends in an approximate triple point at which the peak in 
$d\langle x\rangle/dF$ disappears.

The applicability of our thermodynamic model is limited to experiments that 
are performed under equilibrium conditions. Force-extension relations are mea- 
sured under these circumstances when no hysteresis is observed. The over- 
stretching transition of DNA under equilibrium conditions has been studied 
more extensively than the force-extension relation of titin. For this reason, the 
comparison between experimental and theoretical results is more comprehen- 
sive for the former than for the latter in the present paper. Figure [8] shows that 
our model gives at least a qualitative representation of the experimental static 
force-extension relation of the protein titin [9]. Figures [5] and [6] show that our
model gives a consistent description of the overstretching transition of DNA 
at intermediate and high forces. This means that the force-extension relations
obtained with our theoretical model fit well to the experimental curves for 
varying environmental conditions (different salt concentrations [I] and tem- 
peratures [5]). In the literature different analytically solvable models are in- 
troduced to study the overstretching transition. The model [8] proposed by 
Punkkinen et al. is the only one that is tested against different datasets, ob- 
tained under varying salt concentrations. It gives an adequate description over 
the whole range of the force-extension curve and not only in the transition re- 
gion. An important difference between our model and the model of Punkkinen 
et al. is that the latter model contains more adjustable parameters than our 
model. This leaves room to extend our model with some extra parameter in 
order to correct the behaviour at small forces. 

In figure [2] the positive slope of the boundary line, between the compact and 
the stretched phase of the random walk, is clearly visible. It is an open ques- 
tion whether this so called reentrant behaviour can be observed in any of the 
experiments that we discussed in the present paper. In [22], phase diagrams 
of the force-induced unfolding of single-domain proteins (e.g. immunoglobulin 
domains of titin) are obtained by numerical simulations. These phase diagrams 
also show reentrant behaviour. As already pointed out in [22], the interaction
between the solvent and the biopolymer depends on the temperature. As a
consequence, the phase diagrams of the present paper and [22] are only reliable
for the study of the stretching experiments on biopolymers for small
temperature variations. In the present model, the value of the parameter h is 
determined by the interaction between the solvent and the biopolymer. This 
parameter is an effective parameter. Therefore its value can still depend on 
the temperature (see figure [6] and table [2]). 

In principle one can nowadays perform two types of stretching experiments 
in two different ensembles. In one type of experiment, the two ends of the 
molecule are held at fixed positions and the fluctuating force is measured (the 
fixed-stretch ensemble). In the other type of experiment, the force applied to 
the endpoints of the molecule is kept constant and the fluctuating extension 
is measured (the fixed-force ensemble). Theoretical calculations [12,23] show
that the force-extension relations obtained in the two ensembles do not coin- 
cide in general, although the differences are small and disappear in the long 
chain limit. The results of the present paper are limited to the fixed-force en- 
semble, although the discussed experiments are performed in the fixed-stretch 
ensemble. In [12,24], it is shown how the results of the fixed-force ensemble
can be extended to the fixed-stretch ensemble for the one-dimensional per- 
sistent random walk with drift. Similar calculations for the three-dimensional 
random walk are not yet possible because an explicit expression for the joint 
probability distribution $p_n(x, k)$ is still lacking [11,12]. This is the probability
to end in position x with k changes of direction after n steps. Based on the
results of the one-dimensional walk [12], we assume that one can ignore the
differences between the two ensembles for the long polymers which are studied
in the present paper. 

Our results prove that the persistent random walk with drift is an interesting 
model. Two possible extensions are already suggested above. One can try to 
improve the applicability of the random walk to the overstretching transition 
of DNA by introducing one more parameter in the model. One can also try 
to find an explicit expression for the joint probability distribution $p_n(x, k)$
along the lines of [11]. This would not only allow one to calculate the force-extension
relation in the fixed-stretch ensemble [12], but also to quantify the deviations
from the Boltzmann-Gibbs distribution. Finally, one can also try to extend
the applicability of the present model to experiments that are performed under
non-equilibrium conditions. To this purpose one can assume, along the lines of
superstatistics [25], that the model parameters are stochastic variables which
have some probability distribution. The major problem is then to determine 
this time-dependent probability distribution. 



A 



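Introduce the following shorthand notation

$$
\begin{aligned}
T_1 &= \exp\left(\beta\left[-h + F(b-a)\right]\right), \\
T_2 &= \exp\left(2\beta F b\right), \\
T_3 &= \exp\left(2\beta F a\right), \\
T_4 &= \exp\left(h\beta\right).
\end{aligned}
\tag{A.1}
$$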



One can calculate explicit expressions for 7, 9, /i and a as a function of e only 



If l-4e' 

T\ (1-46) 2 
a = 1= 

7= i_ a _l±_J(i_ 4e ) 



6- 



1 



;i-4e). 



Using these expressions, one is left with the following equation for $\varepsilon$



(A.2) 



e(l-4e) i 



1 



l-4e\ T, 



Ti 



Ti(l-4e) i 
T 4V /7k-(2 + T 4 )(l-4e)e



(A.3) 



This is a cubic equation that can be solved in closed form. It was checked
numerically that it always has a unique physical solution.



B 



Assume $h < 0$, $a < b < 2a$ and $\beta \to \infty$. Then the cubic equation in $\varepsilon$ (A.3)
and the formula for the average end position (12) can be approximated in the
transition region ($b/2 < \langle x\rangle/n < a$) by the following expressions

$$
\varepsilon = \frac{T_1\,(1-4\varepsilon)^2}{T_4\sqrt{T_3}}
\quad\text{and}\quad
\frac{\langle x\rangle}{n} = \frac{a + 4\varepsilon\,(b-a)}{1+4\varepsilon}.
\tag{B.1}
$$

The first equation is now a quadratic equation in $\varepsilon$. It has only one physical
solution ($\varepsilon \in [0, 1/4]$). Inserting this solution in the second equality of (B.1)
results in an expression for $\langle x\rangle$ as a function of $\beta$ and $F$. Then one can
calculate the second derivative of the average end-to-end distance with respect
to the force at constant temperature. This second derivative equals zero if the
following equation holds



$$
F\,(2a - b) = -2h + \frac{3\ln 2}{\beta}.
\tag{B.2}
$$

This is a low temperature approximation for the boundary line between the
compact and stretched phases. The factor $3\ln 2$ is clearly a contribution due
to the entropy. The expressions (B.1) can also be used to obtain a formula for
the force as a function of $\langle x\rangle$ and $\beta$. The result is



$$
-\frac{F b l}{h} = 2 + \frac{1}{h\beta}\left[\ln(l + x) + \ln(l - x) - 2\ln x - 4\ln 2\right],
\tag{B.3}
$$

with $x = 2\langle x\rangle/(nb) - 1$ and $l = 2a/b - 1$.
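The relations of this appendix are simple enough to be checked numerically. The Python sketch below is an illustration only and is not part of the original calculations. It assumes that the first relation of (B.1) reads ε = K(1 − 4ε)² with K = T_1/(T_4 √T_3) = exp(−β[2h + F(2a − b)]), that the second relation reads ⟨x⟩/n = (a + 4ε(b − a))/(1 + 4ε), that (B.2) reads F(2a − b) = −2h + 3 ln 2/β, and that (B.3) reads −Fbl/h = 2 + [ln(l + x) + ln(l − x) − 2 ln x − 4 ln 2]/(hβ). The function names and the parameter values are hypothetical.

```python
import math


def epsilon(beta, force, h, a, b):
    """Physical root (0 <= eps <= 1/4) of eps = K * (1 - 4*eps)**2."""
    # Assumed: ln K = ln[T1 / (T4 * sqrt(T3))] = -beta * (2h + F(2a - b)).
    log_k = -beta * (2.0 * h + force * (2.0 * a - b))
    if log_k > 0.0:
        # Large K: work with 1/K so that exp() cannot overflow.
        inv_k = math.exp(-log_k)
        return 2.0 / (8.0 + inv_k + math.sqrt(16.0 * inv_k + inv_k * inv_k))
    k = math.exp(log_k)
    # Rationalised form of [(8K + 1) - sqrt(16K + 1)] / (32K).
    return 2.0 * k / (8.0 * k + 1.0 + math.sqrt(16.0 * k + 1.0))


def extension_per_step(beta, force, h, a, b):
    """Average end position per step, <x>/n (assumed second relation of B.1)."""
    eps = epsilon(beta, force, h, a, b)
    return (a + 4.0 * eps * (b - a)) / (1.0 + 4.0 * eps)


def force_from_extension(beta, ext, h, a, b):
    """Force as a function of <x>/n and beta (assumed form of B.3, solved for F)."""
    x = 2.0 * ext / b - 1.0    # reduced extension
    ell = 2.0 * a / b - 1.0    # the constant l of (B.3)
    g = (math.log(ell + x) + math.log(ell - x)
         - 2.0 * math.log(x) - 4.0 * math.log(2.0))
    return -(2.0 * h + g / beta) / (b * ell)


def boundary_force(beta, h, a, b):
    """Low temperature boundary line (assumed form of B.2)."""
    return (-2.0 * h + 3.0 * math.log(2.0) / beta) / (2.0 * a - b)


if __name__ == "__main__":
    # Hypothetical parameter values in arbitrary units, chosen only so that
    # h < 0 and a < b < 2a hold.
    beta, h, a, b = 5.0, -1.0, 1.0, 1.5
    f_c = boundary_force(beta, h, a, b)
    for force in (f_c - 1.0, f_c, f_c + 1.0):
        print(force, extension_per_step(beta, force, h, a, b))
```

Under these assumptions, `force_from_extension` inverts `extension_per_step` in the transition region. The inflection point of the force-extension curve lies at the reduced extension x = l/√3, where the bracket in (B.3) takes the value −3 ln 2; this is the origin of the entropic term in (B.2).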